Linguistic profiling of texts for the purpose of language verification

نویسندگان

  • Hans van Halteren
  • Nelleke Oostdijk
چکیده

In order to control the quality of internet-based language corpora, we developed a method to verify automatically that texts are of (near-) native quality. For the LOCNESS and ICLE corpora, the method is rather successful in separating native and non-native learner texts. The Equal Error Rate is about 10%. However, for other domains, such as internet texts, separate classifiers have to be trained on the basis of suitable seed corpora.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Language Features of Russian Texts of Engineering Discourse

The Article is devoted to the applied problem of identifying the linguistic features of engineering texts. The study of Russian-language texts of engineering discourse is usually of an applied nature, in our case, this applied research is caused by the need to teach foreigners who receive professional engineering education in Russia and in Russian language. The object of the research is the Rus...

متن کامل

From Academic to Journalistic Texts: A Qualitative Analysis of the Evaluative Language of Science

This study examined academic articles and journalistic reports in 5 disciplinary areas to explore how similar contents might attitudinally be realized in two different genres. To this end, 25 research articles and 210 news reports were carefully selected and underwent detailed discourse semantic and grammatical analyses with the purpose of identifying the evaluative linguistic patterns....

متن کامل

Textual Enhancement across Linguistic Structures: EFL Learners' Acquisition of English Forms

The benefits of textual input enhancement in the acquisition of linguistic forms have produced mixed results in SLA literature. The present study investigates the effects of textual enhancement on adult foreign language intake of two English linguistic forms-subjunctive mood and inversion structures-to explore the role of the type of linguistic items in input enhancement studies. It also invest...

متن کامل

Linguistic Profiling for Authorship Recognition and Verification

A new technique is introduced, linguistic profiling, in which large numbers of counts of linguistic features are used as a text profile, which can then be compared to average profiles for groups of texts. The technique proves to be quite effective for authorship verification and recognition. The best parameter settings yield a False Accept Rate of 8.1% at a False Reject Rate equal to zero for t...

متن کامل

Gender-preferential Linguistic Elements in Applied Linguistics Research Papers: Partial Evaluation of a Model of Gendered Language

This article intended to investigate whether the gender-preferential linguistic elements found by Argomon, Koppel, Fine and Shimoni (2003) show the same gender-linked frequencies in applied linguistics research papers written by non-native speakers of English. In so doing, a sample of 32 articles from different journals was collected and the proportion of the targeted features to the whole numb...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2004